Genetic analysis of Octopus cyanea reveals high gene flow in the South‐West Indian Ocean

Abstract Octopus cyanea (Gray, 1849), abundant in the South‐West Indian Ocean (SWIO), constitutes a vital resource for both subsistence and commercial fisheries. However, despite this socioeconomic importance, and recent indications of overfishing, little is known about the population structure of O. cyanea in the region. To inform sustainable management strategies, this study assessed the spatio‐temporal population structure and genetic variability of O. cyanea at 20 sites in the SWIO (Kenya, Tanzania, Mozambique, Madagascar, Mauritius, Rodrigues, and the Seychelle Islands) by complementary analysis of mitochondrial DNA (mtDNA) noncoding region (NCR) sequences and microsatellite markers. MtDNA analysis revealed a shallow phylogeny across the region, with demographic tests suggesting historic population fluctuations that could be linked to glacial cycles. Contrary to expectations, NCR variation was comparable to other mtDNA regions, indicating that the NCR is not a hypervariable region. Both nuclear and mtDNA marker types revealed a lack of genetic structure compatible with high gene flow throughout the region. As adults are sedentary, this gene flow likely reflects connectivity by paralarval dispersal. All samples reported heterozygote deficits, which, given the overall absence of structure, likely reflect ephemeral larval recruitment variability. Levels of mtDNA and nuclear variability were similar at all locations and congruent with those previously reported for harvested Octopodidae, implying resilience to genetic erosion by drift, providing current stock sizes are maintained. However, as O. cyanea stocks in the SWIO represent a single, highly connected population, fisheries may benefit from additional management measures, such as rotational closures aligned with paralarval ecology and spanning geopolitical boundaries.

The day octopus (Octopus cyanea Gray, 1849) is primarily benthic, residing in the shallow coastal reefs throughout the Indo-Pacific to the Hawaiian Islands and including the South-West Indian Ocean (SWIO), comprising the East African coast, Madagascar, Mauritius, Seychelles, and islands in between (Chande et al., 2021;Herwig et al., 2012;Norman, 1991).The species is characterized by a short lifespan of approximately 12-15 months (Guard, 2009;Van Heukelem, 1973), rapid growth (Van Heukelem, 1973, 1983) and early sexual maturity (Guard & Mgaya, 2002).Individuals are crepuscular and inhabit conspicuous shallow water "dens" during inactive hours or while brooding, making them susceptible to artisanal fishing efforts using hand tools while inactive (Benbow et al., 2014;Sobrino et al., 2011).Females are highly fecund (100,000-400,000 eggs) and reported to be more vulnerable to capture (Forsythe & Hanlon, 1997;Guard & Mgaya, 2002).The period from the brood's initial spawning and maternal care is approximately 20-30 days; at this point, the female dies, and the paralarvae hatch and enter the water column.The subsequent planktonic paralarval phase takes another 30 days until it reaches a critical size, and thereafter, paralarvae settle and take up a benthic habit (Guard, 2009;Guard & Mgaya, 2002).
Octopus species comprise a substantial share of the dietary protein in local communities in the SWIO, while also providing a valuable commercial fishery to local fishers and nations for trade with foreign collection companies (Rocliffe & Harris, 2016).Tanzania (1241 t), Madagascar (1087 t), and Mauritius (324 t) were the major producers and exporters of octopus species in the SWIO in 2015, with other regional countries landing sizable figures (Rocliffe & Harris, 2016;Sauer et al., 2021).Within the region, O. cyanea is the prominent octopus species caught by artisanal fishers, comprising ~99% of the total catch of the Tanzanian fishery (Guard & Mgaya, 2002), and 95% in Madagascar (Epps, 2007).Despite octopus species life histories seemingly conferring resilience to overfishing, the continual growth in the global market and increased harvesting pressure (including unreported and illegal fishing) have coincided with a decline in fishery productivity (FAO, 2020;Guard & Mgaya, 2002).
The local socioeconomic importance of O. cyanea directs consideration of stock structure for the establishment of effective management strategies.A single population genetic study of O. cyanea in the SWIO, based on mtDNA sequence data, reported some numerically small yet statistically significant interregional differentiation (Φ ST ), but the majority of pairwise Φ ST values were nonsignificant and there was no detectable isolation by distance (IBD) effect (Van Nieuwenhove et al., 2019).Given the notorious difficulty of interpreting such weak genetic structure in terms of contemporary connectivity (Waples, 1998), particularly when based on a single locus, Van Nieuwenhove et al. (2019) highlighted the need for further studies employing more powerful nuclear markers.
The aim of the present study was to utilize both nuclear microsatellite DNA markers and the mtDNA noncoding region (NCR) to further assess the genetic variation of O. cyanea populations across the SWIO.Samples were collected from 20 sites spanning six countries and the autonomous territory of Rodrigues, representing both the East African continental coastline and oceanic islands.This sampling also spanned oceanographic systems that could potentially affect larval dispersal across the region.Samples were collected during two time periods (2010-12 and 2020), permitting the assessment of the temporal stability of any resolved structure, a recognized powerful approach to interpreting the biological significance of low levels of genetic structure (Waples, 1998).The inclusion of mtDNA facilitated comparison with the findings of Van Nieuwenhove et al. (2019) while also providing insight into the respective properties of the NCR, which has been largely unexplored in cephalopods.

| Sampling
Octopus cyanea arm tips were sampled from 11 sites spanning the SWIO by artisanal fishers and direct sampling between 2010 and 2012.An additional 9 sites were sampled during a targeted survey of Tanzanian waters in 2020 (Table 1; Figure 1).All octopus arm tips were sampled from natural populations, immediately preserved in 95% ethanol, and frozen at −20°C when feasible.
Mismatch distribution, the frequency distribution of pairwise differences between haplotypes within a sample, and simulated distribution under a model of demographic expansion were compared using the sum of squared deviations (SSD) and Harpending's raggedness index (HRI) (Harpending, 1994) as a test statistic with significance assessed after 10,000 bootstrap replications (Felsenstein, 1985).
Time since population expansion was estimated by T = τ/2u (Rogers and Harpending, 1992).Mutation rate (u) of 2% per million years based on the mutation rate of COI in cephalopods was used as levels of genetic variability for NCR (ĥ = 0.273 ± 0. Markov-chain Monte Carlo (MCMC) was run using a single chain of 5 × 10 7 iterations, sampling every 5000 generations; the first 5 × 10 6 chains were discarded as burn-in.Bayesian skyline plots were generated using Tracer version 1.7 (Rambaut, et al., 2018).
Genetic structure was also assessed using the individual-based Bayesian clustering method in STRUCTURE version 2.3 (Pritchard et al., 2000).The analysis was run for three iterations, assuming the user inferred K values ranging from 1 to 21 (total number of sampled sites +1).Each run had a burn in of 10 6 steps and 10 6 MCMC repetitions for each model.Model parameter combinations of admixture/no admixture and correlated/independent allele frequencies were varied over multiple analyses performed with and without prior sample knowledge (site location), as recommended by Hubisz et al. (2009).The most probable value of K was estimated using L(k) using the online tool Structure Harvester (Earl & vonHoldt, 2012;Evanno et al., 2005).This was complemented by classical Bayesian self-assignment tests performed in GENALEX.

| RE SULTS
The final mitochondrial NCR dataset comprised 415 sequences (each of 437 bases) from 6 countries and the autonomous territory of Rodrigues in the SWIO, 206 of which were from individuals collected between 2010 and 12 across the SWIO, while the other 209 were collected exclusively from Tanzania in 2020 (Table 1).The microsatellite dataset consisted of 232 Tanzanian O. cyanea collected from 9 sites in 2020 and combined with a dataset of samples collected in 2010-12 spanning SWIO counties.The combined microsatellite dataset set included 962 individuals.

| Microsatellite analysis
Summary indices are presented in Table 1.

| Levels of genetic variation compared to other cephalopods
Genetic variability is crucial for maintaining sustainable yields and population adaptability (Kenchington, Heino, & Nielsen, 2003) The control region (CR)/Noncoding region (NCR) harbors the regulatory elements required for the replication and expression of the mitochondrial genome (Shadel & Clayton, 1997).However, as it does not code for a functional gene, it is typically expected to accumulate mutations more rapidly than other mtDNA regions, with this conferring high investigative power for population and phylogeographic analyses (Shadel & Clayton, 1997).The majority of investigations of CR/NCR divergence among marine invertebrates have focused on decapod crustaceans (Chu, Li, Tam, & Lavery, 2003;Diniz, Maclean, Ogawa, Cintra, & Bentzen, 2005;McMillen-Jackson & Bert, 2003;McMillen-Jackson & Bert, 2004), with only a handful of investigations assessing NCR variation in cephalopods (Aoki, Imai, Naruse, & Ikeda, 2008;Winkelmann et al., 2013).A comparison of NCR data from this study and COI data from Van Nieuwenhove et al. ( 2019) revealed nearly identical levels of variation, contrary to our prior expectation that the NCR may harbor higher levels of diversity.This aligns with other studies that have found low mutation accumulation rates at NCRs in other cephalopods (Aoki, Imai, Naruse, & Ikeda, 2008;Winkelmann et al., 2013).While interspecific data would be useful to test for factors such as hotspot saturation (Galtier, Enard, Radondy, Bazin, & Belkhir, 2006;Alter & Palumbi, 2009) or selective constraints, the data here indicate that the NCR is not a hypervariable region in this species.

| Implications for management and further research
It is important to note that levels of gene flow sufficient to limit population differentiation may fall short of the dispersal required to replenish harvested stocks (Hauser & Carvalho, 2008).Therefore, despite the low levels of genetic structure observed in the SWIO being compatible with high gene flow, there may be some contemporary independence of stocks significant for management.An additional consideration here is that the mtDNA results suggest that populations may not be at migration drift equilibrium.Contemporary connectivity may therefore be overestimated due to historical gene flow and genetic inertia.For these reasons, the resolution of spatial stock structure may be beyond the resolution of neutral genetic markers and would benefit from complementary analysis of markers under selection (Canino, O'Reilly, Hauser, & Bentzen, 2005).Future studies should therefore consider the analysis of genome-wide single nucleotide polymorphisms (SNPs) as these markers have already been shown to improve estimates of population and demographic parameters in other exploited cephalopods (Cheng et al., 2021).Such genomic analyses should be combined with studies of paralarval movement to identify and protect critical habitats, known as octopus replenishing zones.A deeper understanding of the spatial and temporal abundances of paralarvae could also direct dynamic management strategies.For example, the timing of seasonal closures could be aligned with critical times of the year when paralarval dispersal is more active.

From
415 individuals sequenced, 20 NCR haplotypes were resolved (GenBank accession PP448064-PP448083; Figure 2), defined by 19 transition and four transversion mutations.Haplotype 1 (357 individuals) was found to be ubiquitous across the SWIO, while haplotype 2 (found in 27 individuals) occurred in all sites but Rodrigues and Mozambique, and haplotype 3 (6 sequences) was found in low abundance in all countries except Kenya, Mozambique, and the Seychelles (Figure 3).Eleven private haplotypes were found: 8 in Tanzania, 2 in Madagascar, and 1 in the Seychelles.Haplotype 10 was observed in a single individual from the Seychelles and exhibited 6 TAN and MOZ) × (MAD, ROD, MAU, and SEY) African mainland × Islands based on the presence barrier currents −0.004 −0.001 (KEN, TAN, and MOZ) × MAD × (ROD and MAU) MOZ, IV) × KEN × (AN, BE) × ROD × MAU × SEY Note: Country and territory are indicated as: KEN: Kenya, TAN: Tanzania, MOZ, Mozambique, MAD: Madagascar, ROD: Rodrigues, MAU: Mauritius, SEY: Seychelles, AN: Andavadoaka, BE: Beheloke, and IV: Ivovona.The Parentheses indicate grouped samples.Significant departure of values from zero is indicated by *p < .05.TA B L E 2 Hierarchical AMOVA design of O. cyanea populations in the SWIO.F I G U R E 2 Median joining haplotype network of O. cyanea (N = 415) in the SWIO based on 437 bases of the mitochondrial NCR.Perpendicular dashes represent the number of mutations separating haplotypes, and circle size indicates the number of individuals with that haplotype.The Color represents the country of origin.F I G U R E 3 MtDNA NCR haplotype frequencies of O. cyanea (N = 415) across the SWIO.The radius of plots corresponds to the log 10 of the total number of individuals sampled from a region.mutational steps from other haplotypes identified in the Seychelles, Mauritius, and Tanzania.Haplotype diversity was found to range between 0.105 and 0.571 (average h = 0.273 ± 0.143 SE) at Mombasa and Kunduchi, respectively (Table1).The number of haplotypes identified per sample ranged from 1 to 6, with Ivovona being the only site where a single haplotype was found and Pembeni being the only site with 6 haplotypes.Significant deviations from neutral expectations based on Tajima's D and/or Fu's Fs were found in at least one site within each state/territory, except for sites in Kenya, Mozambique, and Rodrigues (Table1).Pooling samples by state/ territory, Fu's Fs were significantly negative everywhere except Kenya, Mozambique, and the Seychelles, while Kenya and Mozambique were the only sites that did not report a significantly negative Tajima's D (TableS1).A mismatch distribution constructed using all 415 sequences displayed a unimodal peak (Figure4).The hypothesis of sudden population expansion was not rejected for the overall pooled sample (HRI = 0.309, p = .589;SSD = 0.002, p = .492).The time since the expansion of O. cyanea in the SWIO was estimated at 171,700 years based on the value of τ = 3 and a substitution rate of 2% per million years.Bayesian skyline analysis indicated that the expansion of population size initiated ~10,000 YBP (Figure 5).Tests of genetic structure (Φ ST ) at the NCR locus reported significant differentiation in 16 of 190 comparisons, with Kunduchi and Beheloke contributing 8 and 5 significant pairwise comparisons out of 16, respectively (Table3).AMOVA reported nonsignificant inter-group differentiation for all defined groupings, including management units suggested by Van Nieuwenhove et al. (2019) (F CT = 0.011, p = .212).No temporal genetic variation was found between Tanzanian samples from 2010 to 2012 and 2020 (F CT = 0.012, p = .146).
Of 147 locus by sample Exact Tests performed on genotype frequencies, 68 deviated from Hardy-Weinberg expectations, all due to heterozygote deficits.No significant genotypic linkage disequilibrium was detected between any pair of loci in global tests.The total number of alleles per locus across sampling sites ranged from 4 (OC22) to 19 (ROC6 and OC18).All sites exhibited similar values for allelic richness (range 9.74 (±0.82 SE) to 11.08 (±0.89SE)) and expected heterozygosity (range 0.79 (±0.07 SE) to 0.86 (±0.03 SE)).Observed heterozygosity was found to vary more widely among sites, ranging from 0.51 (±0.10 SE) to 0.81 (±0.07 SE).In all samples, mean H E was greater than H O , resulting in positive F IS values between 0.033 and 0.376, with 19 out of 20 values being significantly greater than expectations (Table1).Across all samples, global microsatellite differentiation was statistically significant but numerically very small (F ST = 0.004, p = .005),with 43 of 190 ENA-corrected pairwise tests being significant (Table3).Of the pairwise F ST tests, the Kunduchi and Pembeni samples represented 10 and 9 significant results of 43, respectively, post null allele correction.There was no significant correlation between the observed pairwise F ST values (ENAcorrected) and geographic distance (R 2 = .006,p = .282-Figure6), with a lack of any geographical pattern also obvious from PCoA (Figure7).Temporal genetic variation between Tanzanian samples from 2010 to 2012 and 2020 was nonsignificant (F CT = 0.004, p = .167).There were no cases of significant variation between groups in any AMOVA design, including between the management units suggested by Van Nieuwenhove et al. (2019): F CT = <0.001,p = .413.The Bayesian clustering analysis produced a model of F I G U R E 4 Mismatch distribution curves under a model of sudden demographic expansion for O. cyanea (n = 415) in the SWIO based on 437 bases of the mitochondrial NCR.Filled bars: observed frequency of pairwise distribution.Black dashed line: expected distribution.HRI: r = 0.309, p = .589;SSD = 0.002, p = .492.F I G U R E 5 Bayesian skyline plot showing demographic change in effective population size of O. cyanea in the SWIO based on 437 bases of the mitochondrial NCR.The black line represents the mean log 10 estimate of effective population size.The gray shaded area represents the 95% highest posterior density interval.K = 1, and in agreement with this suggested lack of individual-based structure, assignment tests found low rates of self-assignment when samples were grouped by state/territory (7 groups: 23%) or treated independently (20 groups: 8%).

4
| DISCUSS ION The present study combined mitochondrial DNA and nuclear microsatellite DNA markers to assess the degree of spatial genetic structuring and diversity of O. cyanea populations throughout the SWIO.The mtDNA data revealed a high diversity of haplotypes sharing a shallow phylogeny across the region, with additional signals of historic population size fluctuations supported by demographic tests.F ST and Bayesian clustering analyses of microsatellite data identified an absence of clear genetic structure across the region, with a low level of significant "patchy" differentiation.While pairwise F ST tests did result in a number of significant differences, in all cases, the values were low (all <0.018, all but 13 tests below 0.010).They exhibited no obvious geographic pattern, association with oceanographic features (e.g.island versus mainland systems), or signal of Isolation-By-Distance.Overall, our findings indicate extensive gene flow and connectivity in O. cyanea populations throughout the SWIO.

F
Mantel test for correlation between geographical distance (km) and microsatellite genetic difference (F ST ) between all pairwise sample comparisons (R 2 = .006,p = .282).tests indicating historical population size fluctuations, are in agreement with population demographic patterns reported by Van Nieuwenhove et al. (2019).Van Nieuwenhove et al. (2019) attributed such population fluctuations to habitat changes during Pleistocene glacial and interglacial periods.During the Pleistocene epoch, sea levels were ~ 120 m lower than present day on at least 3 occasions

F
Principal co-ordinate analysis (first two axes) of pairwise microsatellite F ST values of O. cyanea samples from 20 sites in the SWIO.Color-coded symbols represent the state/ territory of origin.
zation and demographic expansion, with sea level rising in the ensuing interglacial periods.Mismatch distribution analysis suggested an estimated time of O. cyanea population expansion of ~171,700 YBP.However, if a 10-fold mutation rate correction is applied as proposed by previous studies (González-Wevar, David, & Poulin, 2011; Pardo-Gandarillas, Ibáñez, Yamashiro, Méndez, & Poulin, 2018; Mckeown, Watson, Coscia, Wootton, & Ironside, 2019; Healey et al., 2020), the time of expansion is estimated to be ~17,000 YBP.This rate-corrected estimate is supported by Bayesian skyline analysis, suggesting the onset of demographic expansion occurring ~10,000 YBP.While these estimates are not precisely congruent with one another, they roughly correspond to a post LGM expansion.Regardless of the exact timing, eustatic sea level fluctuations during the Pleistocene are proposed to have effected demographic changes in O. cyanea as observed across High gene flow across the SWIOPairwise Φ ST /F ST values from mtDNA and microsatellite data were numerically small and mostly nonsignificant, while individual-based analyses did not reveal any clusters overall, indicating a lack of genetic differentiation and thus suggesting high gene flow within the studied area.The pairwise Φ ST values here are consistent with values in the order of 0.01 reported byVan Nieuwenhove et al. (2019), where 89.1% of pairwise tests were found to be nonsignificant.Given that O. cyanea adults are sedentary, this connectivity is likely driven by paralarval dispersal.The species planktonic larval duration (PLD) is approximately 30 days, varying with temperature(Guard & Mgaya, 2002).Such a PLD in an ocean region where marine currents typically range from 20 to 30 Sverdrup (~0.11 m s −1 ) and vary in direction and intensity during the southern winter (including reversal of the North flowing Somali current) would facilitate passive longdistance dispersal of paralarvae over hundreds of kilometers(Ali & Huber, 2010;Schott, Xie, & McCreary, 2009).The lack of genetic structure reported for O. cyanea fits with the general pattern of geographically extensive gene flow reported for other cephalopod species with planktotrophic paralarval dispersal such as Loligo forbesii (Steenstrup, 1856)(Shaw, Pierce, & Boyle, 1999), Doryteuthis opalescens (S.S.Berry, 1911)  (Reichow & Smith, 2001), Loligo reynaudii (d'Orbigny,1839-1841)(Shaw et al., 2010), Doryteuthis pealeii (Lesueur, 1821)(Shaw et al., 2010), and Doryteuthis gahi (d'Orbigny, 1835) (McKeown, Arkhipkin, & Shaw, 2019).Spatial genetic structuring in cephalopods seems to occur where there is some form of oceanographic/physical barrier to dispersal (Sandoval-Castellanos, Uribe-Alcocer, & Díaz-Jaimes, 2007; Staaf et al., 2010; McKeown, Arkhipkin, & Shaw, 2019), and there are few (if any) major discrete hydrodynamic barriers in the WIO area covered by the present study.All O. cyanea samples were found to exhibit significant deficits of heterozygotes.In their SNP-based study of Illex argentinus, Chemshirova et al. (2023) described significant heterozygote deficits for all samples within a panmictic population.The authors attributed this to intra-annual pulses of recruitment generating ephemeral genetic differences among groups, mostly likely at the larval, early recruit stage, followed by mixing at later life history stages.Heterozygote deficits among adult samples of I. argentinus had previously been described byAdcock et al. (1999), with the authors excluding inbreeding or mixing of genetically distinct populations as causes.Similar patterns have also been reported in Adriatic Sea species of cephalopods byGaroia et al. (2004), which the authors linked to spawning at different times.In addition to generating heterozygote deficits within populations, such processes of pulsed and variable recruitment are likely to also drive genetic patchiness among areas, as described byCheng et al. (2021), who concluded that intra-annual recruitment pulses underpinned a low level of local patchy structure despite no macrogeographical structure in D. opalescens.Similar recruitment processes could explain the patchy, unstructured differentiation among the 43 significant pairwise tests reported here.Post settlement, O. cyanea becomes resident in their local habitat (~50 m), sults of this study indicate high connectivity of O. cyanea throughout the SWIO, fishery closures and other measures that affect their populations should be coordinated across geopolitical boundaries.The Collaborative Fisheries Management Areas in Tanzania represent an operational model that could be extended to other areas.This should also be part of a broader adaptive management strategy to address changes in ocean temperatures, currents, and other environmental variables that have an impact on paralarval dispersal patterns and to educate communities about the importance of protecting octopus species paralarvae.AUTH O R CO NTR I B UTI O N S Charles R. Treleven: Writing -original draft (equal); writing -review and editing (equal).Mary A Kishe: Data curation (equal); funding acquisition (equal).Mathew O. Silas: Investigation (equal).
Summary information for O. cyanea samples from the SWIO using the mitochondrial noncoding region (N = 415) and 7 microsatellite markers (N = 962): sample size (n), number of haplotypes (N All 962 individuals were Polymerase Chain Reaction (PCR) amplified at seven species-specific microsatellite loci (ROC1, ROC6, ROC17, ROC28, TA B L E 1 hap ), haplotype diversity (h), nucleotide diversity (π), Fu's Fs test (F S ), Tajima's D test (D), mean number of alleles (N A ), rarified allelic richness (15 diploid individuals) (A R ), observed heterozygosity (H O ), expected heterozygosity (H E ), and multilocus inbreeding coefficient values (F IS ).were then aligned by ClustalW in BIOEDIT Pairwise Φ ST values for the mitochondrial noncoding region (top matrix) and F ST values across 7 microsatellite markers following ENA correction (bottom matrix) among samples of O. cyanea from the SWIO.
TA B L E 3Note: In the lower diagonal, underlined values are significantly greater than zero following the ENA correction (95% bootstrap).Both analyses follow 10,000 permutations.Significant departure of values from zero is indicated by *p < .05 and **p < .01 in the top diagonal.